-
-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GPU: Migrate buffers on GPU project, pre-emptively flush device local mappings #6794
Conversation
… mappings Essentially retreading Ryujinx#4540, but it's on the GPU project now instead of the backend. This allows us to have a lot more control + knowledge of where the buffer backing has been changed and allows us to pre-emptively flush pages to host memory for quicker readback. It will allow us to do other stuff in the future, but we'll get there when we get there. Performance greatly improved in Hyrule Warriors: Age of Calamity. Performance notably improved in TOTK (average). Performance for BOTW restored to how it was before Ryujinx#4911, perhaps a bit better. - Rewrites a bunch of buffer migration stuff. Might want to tighten up how dispose stuff works. - Fixed an issue where the copy for texture pre-flush would happen _after_ the syncpoint. TODO: remove a page from pre-flush if it isn't flushed after a certain number of copies.
Tests done on the Ryzen 3600. Monster Hunter Rise - Master 27 FPS / PR 30 FPSNote Cleans up the remaining fps drops in more stressed areas of the game. Catherine - Master 72 FPS / PR 92Note FPS in general is higher but the more important thing is that its increasing the FPS lows from under 30 to over 30 FPS making it speed wise playable. |
Testing in amd 5500XT - Windows - Vulkan. There may be a minor regression in Star Ocean 2
|
This is very unlikely, but it's important to cover loose ends like this.
src/Ryujinx.Graphics.Gpu/Engine/Threed/ComputeDraw/VtgAsComputeState.cs
Outdated
Show resolved
Hide resolved
StorageWrite = 0x80, | ||
|
||
#pragma warning disable CA1069 // Enums values should not be duplicated | ||
StorageAtomic = 0xc0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would the compiler stop complaining if you set it to:
StorageAtomic = StorageRead | StorageWrite
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it still doesn't like that. These aren't really like flags anyways, it's more like a sequential enum shifted up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW I just noticed that in other places, we use this attribute:
[SuppressMessage("Design", "CA1069: Enums values should not be duplicated")]
else | ||
{ | ||
memoryType = Vendor == Vendor.Nvidia ? | ||
SystemMemoryType.DedicatedMemorySlowStorage : |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder why only NVIDIA has "slow storage".
private int _flushCount; | ||
private int _flushTemp; | ||
private int _lastFlushWrite = -1; | ||
private readonly BufferAllocationType _baseType; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be removed too since its unused. Or is there a reason for keeping it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nothing right now I think, but it will still contain Auto
when a buffer is auto selected as HostMapped
, so it could be useful for debug.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm, thanks! I tested a few games here and found no issues. It's nice to see buffer migration reworked, and the backend getting simplified too.
Final perf numbers, tested in Windows 11 AMD 5500XT and Nvidia Titan X
|
Essentially retreading #4540, but it's on the GPU project now instead of the backend. This allows us to have a lot more control + knowledge of where the buffer backing has been changed and allows us to pre-emptively flush pages to host memory for quicker readback. It will allow us to do other stuff in the future, but we'll get there when we get there.
Similar to #4540 , this should only affect performance on dedicated GPUs, not integrated or mobile. There's a lot of work done here that could help separate host imported buffers from copied ones in future, but that's not really important right now. The target is mainly NVIDIA, I don't know how this will affect AMD (hopefully not negatively).
Extra
TODO stuff
- Remove the "Auto" mode from Vulkan, as it isn't used anymore.